Immune Cell Infiltration Across Breast Cancer Subtypes: A Marker Gene Analysis of the METABRIC Cohort

Author

cyy36@cam.ac.uk

Published

February 24, 2026

Cover Page

MSt Healthcare Data Science

Authentication of practice

I confirm that I have fully read and understood the assignment brief for this module. Y

Details

Name: Chung Yan Surname: Yu
Submission Date
Word Count: whole assignment including codes
Word Count: main body excluding abstract, references and supplementary materials

Permission to share your assignment

I do give permission to share my assignment with future MSt participants.

University statement of originality

This assignment is the result of my own work and includes nothing which is the outcome of work done in collaboration except as declared in the Preface and specified in the text. It is not substantially the same as any that I have previously submitted for a degree or diploma or other qualification at the University of Cambridge or any other university or similar institution, or that is being concurrently submitted, except as declared in the Preface and specific in the text. I further state that no substantial part of my Portfolio has already been submitted, nor is being concurrently submitted for any such degree, diploma or other qualification at the University of Cambridge or any other university or similar institution except as declared in the Preface and specified in the text.

I confirm the statement of originality as above Y

Questions for reflection

Self-assessment is an important aspect of feedback literacy, which is, in turn, key to the development of expertise. As you proceed through the MSt Healthcare data science programme, we hope that you will make use of the following prompts to assess your own work on assignments. Specific assignment briefs will likely indicate which of these to address for which assessments, but, in general, we expect you to respond to one or two for each assignment on your course.

For each of the questions, do not spend too long answering – keep it brief. For each question you answer, limit yourself to no more than three items. And please remember, this is optional and developmental: these cover sheets are designed to create space for self-assessment and feedback dialogue, rather than additional assignment workload.

  1. Which aspects of this assignment are you most uncertain about and/or would most like to receive feedback on?

  2. What elements are you left pondering after this assignment that you would like to discuss further?

  3. How have you incorporated feedback from peers and tutors into this assignment?

  4. How, and to what extent, have you been able to incorporate feedback on previous course work into this assignment?

  5. Using the wording in the rubric, how would you describe the quality of the different aspects of your work?

Declaration of the use of generative AI

Which permitted use of generative AI are you acknowledging? Semantic search of literature and notes, outline creation, output formatting, informed feedback, code generation
Which generative AI tool did you use (name and version)? Claude Code v2.1.1x (Opus 4.6), M365 CoPilot
What did you use the tool for? Searching for relevant journal publications (via PubMed MCP), collating notes in my Obsidian vault, .bib file curation, code debugging+generation
How have you used or changed the generative AI’s output My collated notes always stay in point form so that I write out the paragraphs in my own words. Feedback provided by the GenAI models are weighed and assessed before being acted on.

Abstract

The ability for immune cells to infiltrate the tumour microenvironment (TME) plays a key role in the biological war machine taking on breast cancer, so much so that such infiltration has prognostic value. Here, immune infiltration of 14 immune cell types in breast tumour samples of the METABRIC cohort was scored using the expression level of marker genes, to provide an extensive view of how immune populations associate with survival across molecular subtypes. Immune variation was assessed across PAM50 subtypes, and prognostic associations were evaluated using univariate and PAM50-adjusted Cox proportional hazards models. K-means clustering tested if aggregating immune profiles improved prognostic value beyond individual cell types.

Of 14 cell types, 12 showed significant variation across PAM50 subtypes, with Basal-like and HER2-Enriched subtypes showing the highest lymphocyte infiltration. Immune-mutation associations were largely confounded by PAM50 subtype, with TP53 mutations driving most infiltration signals. Univariate Cox models identified mast cells (HR 0.885) and macrophages (HR 1.091) as prognostic. The adjustment for PAM50 subtypes stripped macrophages of prognostic significance, and revealed protective associations for B cells (HR 0.889), cytotoxic cells (HR 0.903) and CD8 T cells (HR 0.914), with mast cell associations slightly reduced (HR 0.911). Infiltration of mast cells deviates from the lymphocytic patterns, showing enrichment in Luminal rather than Basal-like tumours, consistent with hormone-driven SCF/c-Kit recruitment. Immune clustering failed to improve on single cell type models when subtype was factored as a covariate.

Immune infiltration showed heterogeneity across cell types and PAM50 subtypes, with subtype-independent prognostic value in breast cancer. B cells and mast cells represent two axes of immune variation with distinct biological drivers and consistent protective associations. Individual cell scores outperformed aggregated profiles, reflecting the diversity in immunological protective pathways.

Introduction

Tumour cells evading the detection and wrath of the immune system are a recognised hallmark of cancer1, and breast cancer is no exception2. The interplay between tumour cells, immune cells and other local factors forms the tumour microenvironment (TME), and research in the TME of breast cancers has yielded clinical applications3. The amount of tumour-infiltrating immune cells in breast cancers has shown prognostic value4, and this amount and prognostic impact varies across breast cancer’s molecular subtypes5. These subtypes are defined by gene expression6, and are popularised and commercialised as the PAM50 gene expression panel, which has its own prognostic significance7. The subtypes include Luminal A, Luminal B, HER2-Enriched, Basal-like and Normal-like.

The METABRIC cohort from Curtis et al.8 is one of the larger datasets on breast cancer containing clinical, gene expression and mutation data of 2,509 patients. The original METABRIC analysis showed immune infiltration in one of their integrative clusters (IntClust4), showing deletions in T cell receptors and mRNA amplification for immune cell activity, but stopped short of fully characterising the immune cells involved. Jiang et al.9 characterised immune cell infiltration in the METABRIC cohort using CIBERSORT10, developing new immunotypes with prognostic significance. Ali et al.11 applied the same deconvolution approach to a larger multi-study cohort including METABRIC, finding that immune associations with outcome depended on estrogen receptor (ER) status. However, CIBERSORT is more of a black box model, a computational approach using deconvolution to estimate cell proportions from RNA mixtures. Therefore, there is a need to use a transparent approach to score immune cell infiltration in METABRIC’s breast tumour samples, and scrutinise any prognostic value from such scoring.

A panel of immune cell marker genes developed by Danaher et al.12 was used to score immune infiltration in the METABRIC cohort. The infiltration of different immune cell types as well as their prognostic value were investigated across the PAM50 subtypes. The goal was to characterise the immune landscape across PAM50 subtypes, identify which immune populations associate with survival, and test whether immune cell scores or multi-dimensional immune profiles improve prognostic value beyond subtype classification alone.

Methods

Click here to see how to pull fresh data straight from cBioPortal.
# To pull fresh data from cBioPortal instead of using the included processed files,
# delete the data/processed/ folder and re-render this document.
source("scripts/00-data-cleaning.R")
source("scripts/01-immune-cell-scoring.R")
source("scripts/02-task1-exploration.R")

This study makes use of the METABRIC dataset8 available on cBioPortal1315, including clinical, gene expression and mutation datasets of 2,509 patients. For gene expression, the log2 intensity was used in place of the normalised z-score to preserve the proportional relationships when applying the immune cell scoring.

The principle of avoiding circularity guided two data selection decisions. First, PAM50 subtypes were chosen over IntClust, as IntClust subtypes are defined by copy number aberrations that can directly influence immune marker gene expression, and IntClust4 is already characterised by immune infiltration. The clear lack of overlap between the PAM50 and Danaher’s immune cell gene panels also strengthened this choice. Second, the exclusion of Claudin-low patients. Claudin-low is another breast cancer subtype used alongside PAM50 subtypes on cBioPortal’s METABRIC dataset, and is characterised by immune infiltration16. Fougner et al.16 further demonstrated that Claudin-low is not an independent subtype in addition to the PAM50 subtypes, therefore patients of this category (n = 218) were removed.

Immune cell scoring is done by taking the mean log2 gene expression of the cell types’ marker genes, originally defined by Danaher et al.12 and tabulated in Table S1. Of the 60 marker genes from the Danaher panel, only 58 genes have expression data in the METABRIC dataset. TPSB2 is missing for mast cells and XCL2 for NK cells. The CD4 T cell score is derived by subtracting the CD8 T cell score from the overall T cell score. The Danaher scoring method was chosen for its simplicity and marker genes validated against flow cytometry, in contrast to deconvolution approaches previously applied to this cohort9.

Cell type Markers Found Genes used Genes missing
B-cells 9 9 BLK, CD19, FCRL2, MS4A1, FAM30A, TNFRSF17, TCL1A, SPIB, PNOC
CD45 1 1 PTPRC
CD8 T cells 2 2 CD8A, CD8B
Cytotoxic cells 10 10 PRF1, GZMA, GZMB, NKG7, GZMH, KLRK1, KLRB1, KLRD1, CTSW, GNLY
DC 3 3 CCL13, CD209, HSD11B1
Exhausted CD8 4 4 LAG3, CD244, EOMES, PTGER4
Macrophages 4 4 CD68, CD84, CD163, MS4A4A
Mast cells 5 4 TPSAB1, CPA3, MS4A2, HDC TPSB2
Mature NK 4 4 KIR2DL3, KIR3DL1, KIR3DL2, IL21R
NK cells 3 2 XCL1, NCR1 XCL2
Neutrophils 7 7 FPR1, SIGLEC5, CSF3R, FCAR, FCGR3B, CEACAM3, S100A12
T-cells 6 6 CD6, CD3D, CD3E, SH2D1A, TRAT1, CD3G
Th1 cells 1 1 TBX21
Treg 1 1 FOXP3
NoteTable S1 — Immune marker gene coverage

Verifying the immune marker gene coverage on the Illumina HT-12 v3 microarray of the METABRIC dataset.

Immune variation across PAM50 subtypes was assessed by Kruskal-Wallis test with Wilcoxon for pairwise comparisons. Immune-mutation associations were tested via Wilcoxon rank-sum for genes mutated in ≥ 10 patients to provide statistical power; these were further adjusted for PAM50 subtypes via linear regression. Prognostic associations were evaluated using Cox proportional hazards models, first univariately and then multivariately to adjust for PAM50 subtypes, along with Kaplan-Meier curves with log-rank tests for visualisation. Interaction between immune scores and PAM50 subtypes was tested by likelihood ratio comparison of models with and without the interaction term. Immune cell scores were standardised first to yield per-standard-deviation hazard ratios. To develop multi-dimensional immune profiles, clusters were fitted via k-means with silhouette scoring. All p-values from multiple testing were corrected via Benjamini-Hochberg false discovery rate (FDR) and reported as FDR throughout. Overall survival was used for prognosis as the disease-specific survival data used by Curtis et al.8 was not available on cBioPortal.

The whole analysis is stored in this GitHub repository, including the environment.yml tracking the packages used, including ComplexHeatmap17,18, survival19,20, survminer21 and cBioPortalData22. This study is a secondary analysis of the publicly available METABRIC dataset accessed via cBioPortal under the Open Database License (ODbL v1.0), which contains de-identified data. No ethical approval or individual consent was required.

Results

Cohort Summary and Mutation Landscape

From the n = 2509 METABRIC cohort, 529 patients lacked expression data (Figure 1). These were significantly younger than the 1980 with expression data (median age 58 vs 61.8 years; Wilcoxon rank-sum p < 0.001). Not all patients had mutation data, resulting in varying analysis pools across sections. The further removal of patients with Claudin-low (n=218) and NC (Not Classified, n=6) left 1756 patients for further analysis. Immune cell scores were computed for 14 cell types using the approach from Danaher et al.12, forming the basis for all subsequent analyses.

In Table 1, of the 1756 patients, Luminal A was the largest group at 40%, with Basal-like and HER2-Enriched being smaller and also younger. Overall 41% survived, with HER2-Enriched having the lowest survival rate at 30%. Median follow-up length is almost 10 years, sufficient for long term survival analysis via Cox models.

NoteFigure 1 — Data availability across modalities

Intersection plot showing patient counts for each combination of available data modalities (N = 2,509).

Characteristic Overall
N = 1756 (100%)
Basal-like
N = 209 (12%)
HER2-Enriched
N = 224 (13%)
Luminal A
N = 700 (40%)
Luminal B
N = 475 (27%)
Normal-like
N = 148 (8.4%)
p-value
Age at diagnosis 62 (52, 71) 54 (44, 65) 59 (50, 69) 63 (53, 72) 67 (59, 74) 57 (46, 66) <0.001
ER status





<0.001
    Negative 339 (20%) 172 (84%) 115 (53%) 13 (1.9%) 11 (2.4%) 28 (19%)
    Positive 1,386 (80%) 33 (16%) 103 (47%) 677 (98%) 456 (98%) 117 (81%)
    Unknown 31 4 6 10 8 3
Tumour Grade





<0.001
    1 154 (9.2%) 2 (1.0%) 4 (1.9%) 118 (18%) 18 (3.9%) 12 (8.6%)
    2 714 (42%) 17 (8.3%) 54 (25%) 378 (57%) 186 (41%) 79 (57%)
    3 815 (48%) 187 (91%) 155 (73%) 172 (26%) 253 (55%) 48 (35%)
    Unknown 73 3 11 32 18 9
Cellularity





<0.001
    High 898 (52%) 124 (60%) 125 (59%) 326 (47%) 292 (63%) 31 (23%)
    Low 163 (9.5%) 28 (14%) 15 (7.0%) 53 (7.7%) 22 (4.7%) 45 (34%)
    Moderate 650 (38%) 54 (26%) 73 (34%) 312 (45%) 153 (33%) 58 (43%)
    Unknown 45 3 11 9 8 14
Follow-up (months) 116 (60, 184) 86 (33, 183) 97 (39, 172) 130 (82, 197) 104 (55, 164) 121 (64, 176) <0.001
Deceased 1,044 (59%) 115 (55%) 157 (70%) 378 (54%) 315 (66%) 79 (53%) <0.001
NoteTable 1 — Cohort characteristics by PAM50 subtype

Baseline characteristics of 1,756 METABRIC patients with expression data, stratified by PAM50 subtype. Continuous variables summarised as median (Q1, Q3); categorical as n (%). P-values from Kruskal-Wallis (continuous) or Pearson’s Chi-squared (categorical) tests.

Immune-mutation associations

Of the 149 genes mutated in more than 10 patients, 40 gene-cell type pairs reached FDR < 0.05 (Figure 2). However, only 11 remain after correcting for PAM50 subtypes. TP53 stood out as mutations in this gene significantly elevated immune cell scores in 8 cell types, seen in detail in Figure 3. These elevated cell types are broadly involved in the cytotoxic response, particularly CD8 T cells, cytotoxic cells and mature natural killer (NK) cells (NK CD56dim). This is consistent with TP53 mutations driving genomic instability and neoantigen production that trigger cytotoxic immune response. To investigate beyond mutation-driven effects on immune infiltration, subtype-driven effects were investigated next.

NoteFigure 2 — Mutation-immune cell score associations

Median immune cell score differences (mutated minus wild-type) for genes with at least one significant association across 14 cell types. Significance by Wilcoxon rank-sum test (FDR < 0.05); only genes mutated in ≥10 patients tested.

NoteFigure 3 — TP53 mutation and immune infiltration after PAM50 adjustment

PAM50-adjusted effect sizes from linear regression (x-axis) and significance (y-axis) for each immune cell type. Dashed line marks FDR = 0.05.

Immune Infiltration Across PAM50 Subtypes

Overall, immune infiltration varied across cell types and PAM50 subtypes as seen in Figure 4. 12 of 14 cell types showed significant differences across PAM50 subtypes (Figure 5, FDR < 0.05). Even within PAM50 subtypes, considerable heterogeneity remained across different cell types. The two cell types that did not differ significantly were NK and regulatory T (Treg) cells. For the significant cell types, several patterns emerged.

Among these, the lymphocyte populations (B cells, CD4 T cells, CD8 T cells, exhausted CD8 T cells, cytotoxic cells and mature NKs) followed the pattern of being enriched in HER2-Enriched and Basal-like subtypes, and relatively depleted in Luminal A and B subtypes, with no significant difference between the two high or the two low subtypes. Mast cells presented a different pattern, where HER2-enriched, Basal-like, Luminal A and B subtypes were all statistically different from each other. Strikingly, the Basal-like subtype had the lowest score out of all subtypes, a trend not seen in other cell types. CD45, the pan-leukocyte marker, showed naturally less distinct groupings but adhered to the overall trend of high infiltration in Basal-like tumours and low infiltration in Luminal A. Other cell types showed more heterogeneous patterns across the subtypes. Normal-like tumours did not show a consistent pattern across cell types, sharing similarities with different subtypes across cell types.

Given the broad and significant differences in immune infiltration among PAM50 subtypes, the impact of immune infiltration on survival was investigated next.

NoteFigure 4 — Immune cell score heatmap across PAM50 subtypes

Z-scored immune cell scores for 1,756 patients, split and hierarchically clustered by PAM50 subtype. Blue = below-mean; red = above-mean infiltration.

NoteFigure 5 — Immune cell scores by PAM50 subtype

Violin plots with box plots for cell types reaching Kruskal-Wallis FDR < 0.05. Lettering above each subtype plot indicate similarity/difference, subtypes sharing a letter do not differ significantly (pairwise Wilcoxon, FDR < 0.05).

Immune Scores and Overall Survival

Among the 1756 patients with complete survival data (1044 deaths), only two of the 14 cell types showed significance after FDR correction (Figure 6/Table 2). Mast cells, the cell type that showed uniquely low infiltration in Basal-like tumours, were associated with an 11.5% reduction in hazard (HR 0.885, FDR = 0.002), consistent with the protective role reported in other studies23. Conversely, macrophage scores were associated with a 9.1% increase in hazard (HR 1.091, FDR = 0.031). The remaining 12 cell types, including established tumour infiltrating lymphocytes (TILs) such as T cells and NK cells did not show significance after FDR correction. However, given that immune scores covary strongly with PAM50 subtypes, and PAM50 is itself prognostic, these univariate models could be confounded. The true prognostic effects of these cell types could be masked by PAM50 subtypes. Even Mast cells, the best-performing cell type, achieved a concordance (C-index) of only 0.55, indicating modest discriminative ability.

This prompted further PAM50-adjusted analysis to discern the immune-infiltration-driven effects from PAM50-subtype-driven effects on survival differences.

Cell type HR Lower 95% CI Upper 95% CI p-value C-index FDR
Mast cells 0.885 0.831 0.942 <0.001 0.550 0.002
Macrophages 1.091 1.028 1.159 0.004 0.543 0.031
B cells 0.931 0.872 0.993 0.029 0.512 0.102
Treg 1.071 1.009 1.138 0.025 0.527 0.102
CD8 T cells 0.939 0.883 0.999 0.047 0.508 0.131
Cytotoxic cells 0.948 0.891 1.009 0.093 0.499 0.217
NK cells 1.051 0.988 1.119 0.115 0.510 0.230
Neutrophils 1.041 0.982 1.103 0.174 0.514 0.287
Th1 cells 1.042 0.980 1.108 0.185 0.515 0.287
CD4 T cells 1.025 0.968 1.086 0.395 0.512 0.554
Exhausted CD8 0.980 0.923 1.041 0.519 0.493 0.660
Mature NK 0.989 0.931 1.050 0.708 0.484 0.826
CD45 0.992 0.931 1.057 0.804 0.491 0.860
DC 1.006 0.945 1.070 0.860 0.518 0.860
NoteFigure 6 / Table 2 — Univariate Cox regression: immune scores and overall survival

Hazard ratios per SD increase from univariate Cox proportional hazards models for each immune cell type (FDR < 0.05).

PAM50-Adjusted Immune Prognostic Analysis

After adjusting for PAM50 subtype confounding via multivariate Cox models, 4 of 14 cell types showed significant effects on survival compared to 2/14 pre-adjustment. B cells, cytotoxic cells and CD8 T cells gained significance while macrophages lost significance, mast cells retained significance, but their effects were less pronounced after adjustment. As seen in Figure 7, all four cell types showed protective hazard ratios: B cells (HR = 0.889, FDR = 0.011), cytotoxic cells (HR = 0.903, FDR = 0.018), CD8 T cells (HR = 0.914, FDR = 0.024), mast cells (HR = 0.911, FDR = 0.024). Macrophages losing significance even before FDR adjustment suggests that significance shown prior was likely due to PAM50 subtype confounding. Kaplan-Meier curves splitting patients at median cell scores confirmed significant survival separation for all four cell types (Figure 8, log-rank p < 0.05).

Delving deeper for each significant cell type, per-subtype effects revealed which subtypes drove the overall associations, except for Normal-like tumours, which had consistently wide confidence intervals across cell types, likely due to its small population. As the overall adjusted effects were protective for all significant cell types, no subtypes showed significantly adverse effects. For B cells, both Basal-like and HER2-enriched subtypes showed significant protective ratios, while the Luminal B subtype trended in the opposite direction, though this did not reach significance. For cytotoxic cells and CD8 T cells, only the HER2-enriched subtype showed significant protective ratios. Mast cells, the only cell type that was also significant in the univariate analysis, showed significant protective effects in the Luminal A subtype instead.

Further analysis via interaction models confirmed that per-subtype variation was not statistically significant after FDR correction, although B cells showed borderline heterogeneity (p = 0.047, FDR = 0.189), consistent with the opposing trend observed in the Luminal B subtype. The remaining three cell types (cytotoxic cells, CD8 T cells, mast cells) showed no statistically significant heterogeneity, suggesting the protective effects were broadly consistent across subtypes.

Even if four cell types showed improved survival after adjusting for PAM50 subtype, concordance was still modest, with C-index staying below 0.6 (Table 3). As an attempt to extract more prognostic value from immune cell scores, immune cell scores were aggregated via clustering and compared with single immune cell scores.

Cell type HR Lower 95% CI Upper 95% CI p-value FDR C-index
B cells 0.889 0.830 0.952 <0.001 0.011 0.587
Cytotoxic cells 0.903 0.845 0.965 0.003 0.018 0.582
CD8 T cells 0.914 0.858 0.974 0.006 0.024 0.582
Mast cells 0.911 0.852 0.975 0.007 0.024 0.584
Mature NK 0.925 0.865 0.989 0.022 0.061 0.579
Treg 1.066 1.003 1.132 0.038 0.089 0.582
Exhausted CD8 0.942 0.883 1.005 0.069 0.139 0.578
NK cells 1.056 0.992 1.123 0.086 0.151 0.578
Neutrophils 1.046 0.987 1.108 0.132 0.205 0.581
Th1 cells 1.045 0.983 1.110 0.160 0.208 0.580
Macrophages 1.045 0.982 1.111 0.164 0.208 0.582
CD45 0.973 0.913 1.037 0.394 0.459 0.573
DC 0.978 0.917 1.043 0.499 0.537 0.575
CD4 T cells 1.006 0.949 1.066 0.852 0.852 0.577
NoteFigure 7 / Table 3 — PAM50-adjusted Cox regression and per-subtype effects

Cox proportional hazards models adjusted for PAM50 subtype. Forest plot shows the 4 cell types reaching FDR < 0.05; table shows all 14.

NoteFigure 8 — Kaplan-Meier survival curves for PAM50-adjusted significant cell types

Patients split at median score. P-values from log-rank test; titles show PAM50-adjusted HR and FDR.

Aggregating Immune Scores via Clustering

Two clustering approaches were employed and compared: one using all 14 cell type scores, and one restricted to the four cell types that reached prognostic significance after PAM50 adjustment, to test whether focusing on prognostically relevant cell types would improve clustering performance. Both models were also compared against single cell scores and a model with only PAM50 as the baseline.

The two clustering profiles (Figure 9) revealed clear hot vs cold splits for most cell types, including the prognostic cell types revealed by PAM50 subtype adjustment: B cells, CD8 T cells, cytotoxic cells. One notable exception was mast cells, which failed to split, showing near-zero mean z-scores in both clusters in both profiles.

Comparing all the models together against the PAM50 only baseline model (Table 4), all single cell score models significantly improved on the PAM50 baseline model, with the B cell model gaining the most (∆AIC = -10.1, p < 0.001). Both clustering models failed to improve on the baseline model, with the 4-significant-cell model showing non-significant gains (p = 0.086, ∆AIC = -0.9), and the all-14-cell model performing worse than the baseline (p = 0.314, ∆AIC = 1.0).

The failure of the clustering models could be explained by the lack of clear splitting of the mast cells. Mast cells had shown prognostic relevance, but their variation deviated from the main pattern of immune-hot-cold split captured by the clustering, diluting their signal when aggregated.

NoteFigure 9 — Immune cluster centroids

Mean z-scored immune cell type profiles for k-means clusters.
Left: 2 clusters from all 14 cell types.
Right: 2 clusters from the 4 PAM50-adjusted significant cell types only.
Clusters labelled by overall immune level. Red = above-average infiltration; blue = below-average.

Model df Concordance AIC LR p vs PAM50 ΔAIC
PAM50 subtype only 4 0.576 14057.6 0.0
PAM50 + B cells 5 0.587 14047.5 <0.001 -10.1
PAM50 + Cytotoxic cells 5 0.582 14050.3 0.002 -7.3
PAM50 + CD8 T cells 5 0.582 14051.8 0.005 -5.8
PAM50 + Mast cells 5 0.584 14052.2 0.007 -5.4
PAM50 + Cluster (4 sig, k=2) 5 0.577 14056.6 0.086 -0.9
PAM50 + Cluster (all 14, k=2) 5 0.576 14058.5 0.314 1.0
NoteTable 4 — Prognostic comparison: PAM50 baseline vs immune-augmented models

LR test p-values against PAM50-only baseline. ΔAIC relative to baseline (negative = better fit).

Discussion

Survival analysis using univariate and multivariate approaches elucidated a complex immune landscape in breast cancers. Without adjusting for PAM50 subtypes, only mast cells and macrophages showed significance. But after adjustment, macrophages lost significance entirely while B cells, cytotoxic cells and CD8 T cells gained prognostic significance. This pattern of prognostic gain and loss depending on subtype adjustment is consistent with existing CD8 literature, where CD8 infiltration is only prognostic in ER-negative patients24 and in Basal-like tumours25. Ali et al.11 found similar subtype-dependent effects using CIBERSORT across approximately 11,000 tumours. These findings indicate that immune prognostic value in breast cancer is cell-type-specific and subtype-dependent, rather than a uniform infiltration effect.

B cells emerged as the strongest prognostic cell type after PAM50 adjustment. This is consistent with recent findings across cancer types, where B cell signatures have been associated with favourable outcomes and the presence of tertiary lymphoid structures (TLS), organised immune aggregates rich in B cells26,27. Petitprez et al. notably found B cells to be the strongest prognostic factor in sarcoma even after accounting for CD8 and cytotoxic cell levels27, mirroring the independent B cell signal observed here.

Mast cells showed the opposite infiltration pattern. The three lymphocyte populations that gained significance after adjustment (B cells, cytotoxic cells and CD8 T cells) were most abundant in Basal-like and HER2-Enriched tumours. The inverse was true for mast cells, showing enrichment in the two Luminal subtypes instead. This is in line with existing findings of higher mast cell density in Luminal than in triple-negative and HER2+ tumours seen via immunohistochemistry28. This could reflect a distinct immune recruitment biology. Mast cells are recruited via the SCF/c-Kit pathway, where tumour cells produce stem cell factor (SCF) that binds c-Kit receptors on mast cells. In ER-positive tumours, estrogen signalling upregulates KITLG, the gene encoding SCF, creating a hormone-driven recruitment circuit absent in ER-negative subtypes29. This independent recruitment pathway explains why mast cell variation did not follow the lymphocyte gradient in the clustering analysis, where k-means captured the dominant lymphocyte axis but failed to represent mast cell variation. The mast cell protective association is also consistent with another large independent cohort (n=4,444)23.

The side-by-side comparison of univariate and PAM50-adjusted models explicitly quantified confounding rather than just assuming it, revealing how subtype adjustment reshapes the prognostic landscape. All 14 cell types were screened with FDR correction, avoiding selective reporting. The immune cell marker genes12 used were validated against flow cytometry, allowing direct linkage between gene expression and each cell score. Still, several limitations should be noted. The METABRIC expression data were generated on the Illumina HT-12 v3 microarray, an older platform that captured 58 of 60 immune marker genes. Moreover, certain cell types relied on single-gene markers (Table S1), which may limit sensitivity for these cell types. Validation in RNA-seq cohorts such as TCGA-BRCA would address both gaps. Treatment type data were available on cBioPortal but were not modelled as covariates in this analysis. As treatment assignment in breast cancer is subtype-dependent, treatment effects on immune infiltration and prognosis remain an unaddressed confounder warranting treatment-stratified analysis. Finally, the gene expression data were at a single timepoint, failing to capture the immune dynamics over the course of the disease. Circulating tumour DNA profiling could enable longitudinal immune monitoring to add temporal depth to the analysis. Applying CIBERSORT to the same cohort with the same PAM50-adjusted pipeline would also test whether the confounding findings are robust to different scoring methods.

Conclusion

Using immune cell marker genes from Danaher et al., 14 immune cell types were quantified across the METABRIC breast cancer cohort and evaluated for prognostic value. Immune infiltration varied substantially across PAM50 subtypes, with Basal-like and HER2-Enriched tumours showing enriched lymphocytes. Immune-mutation associations were largely confounded by PAM50 subtype, with TP53-mutated tumours driving most of the infiltration signals. Univariate and multivariate survival analysis identified B cells, CD8 T cells, cytotoxic cells and mast cells as independently prognostic, with macrophages reaching significance only in univariate analysis but losing prognostic value after PAM50 adjustment. All independently prognostic cell types showed protective associations, with B cells being the strongest. Mast cells followed a distinct infiltration pattern, enriched in Luminal rather than Basal-like tumours, consistent with a hormone-driven recruitment pathway independent of the lymphocytic axis. Clustering immune scores failed to improve prognostic value beyond individual cell type scores, suggesting that single independent cell types could be more informative than aggregated profiles when subtype is accounted for.

References

1.
Hanahan, D. & Weinberg, R. A. Hallmarks of cancer: The next generation. Cell 144, 646–674 (2011).
2.
Gil Del Alcazar, C. R., Alečković, M. & Polyak, K. Immune escape during breast tumor progression. Cancer Immunology Research 8, 422–427 (2020).
3.
4.
5.
Stanton, S. E. & Disis, M. L. Clinical significance of tumor-infiltrating lymphocytes in breast cancer. Journal for ImmunoTherapy of Cancer 4, 59 (2016).
6.
Perou, C. M. et al. Molecular portraits of human breast tumours. Nature 406, 747–752 (2000).
7.
Parker, J. S. et al. Supervised risk predictor of breast cancer based on intrinsic subtypes. Journal of Clinical Oncology 27, 1160–1167 (2009).
8.
9.
10.
Newman, A. M. et al. Robust enumeration of cell subsets from tissue expression profiles. Nature Methods 12, 453–457 (2015).
11.
Ali, H. R., Chlon, L., Pharoah, P. D. P., Markowetz, F. & Caldas, C. Patterns of immune infiltration in breast cancer and their clinical implications: A gene-expression-based retrospective study. PLOS Medicine 13, e1002194 (2016).
12.
Danaher, P. et al. Gene expression markers of tumor infiltrating leukocytes. Journal for ImmunoTherapy of Cancer 5, 18 (2017).
13.
14.
15.
16.
Fougner, C., Bergholtz, H., Norum, J. H. & Sørlie, T. Re-definition of claudin-low as a breast cancer phenotype. Nature Communications 11, 1787 (2020).
17.
Gu, Z., Eils, R. & Schlesner, M. Complex heatmaps reveal patterns and correlations in multidimensional genomic data. Bioinformatics 32, 2847–2849 (2016).
18.
Gu, Z. Complex heatmap visualization. iMeta https://doi.org/10.1002/imt2.43 (2022) doi:10.1002/imt2.43.
19.
Therneau, T. M. A Package for Survival Analysis in R. (2026).
20.
Therneau, T. M. & Grambsch, P. M. Modeling Survival Data: Extending the Cox Model. (Springer, New York, 2000).
21.
Kassambara, A., Kosinski, M. & Biecek, P. Survminer: Drawing Survival Curves Using ’Ggplot2’. (2025). doi:10.32614/CRAN.package.survminer.
22.
Ramos, M. et al. Multiomic integration of public oncology databases in Bioconductor. JCO Clinical Cancer Informatics 4, 958–971 (2020).
23.
Rajput, A. B. et al. Stromal mast cells in invasive breast cancer are a marker of favourable prognosis: A study of 4,444 cases. Breast Cancer Research and Treatment 107, 249–257 (2008).
24.
Ali, H. R. et al. Association between CD8+ t-cell infiltration and breast cancer survival in 12,439 patients. Annals of Oncology 25, 1536–1543 (2014).
25.
26.
Helmink, B. A. et al. B cells and tertiary lymphoid structures promote immunotherapy response. Nature 577, 549–555 (2020).
27.
Petitprez, F. et al. B cells are associated with survival and immunotherapy response in sarcoma. Nature 577, 556–560 (2020).
28.
29.
Harrell, J. C., Shroka, T. M. & Jacobsen, B. M. Estrogen induces c-kit and an aggressive phenotype in a model of invasive lobular breast cancer. Oncogenesis 6, 396 (2017).

Addendum

Even if my day job is somewhat cancer adjacent, diving deep into this METABRIC dataset made me feel like a novice when it came to cancer / breast cancer. But it has been fun seeing what we learn in class come together in more ways than one. The lead authors for the immune cell scoring paper are from the company NanoString Technologies, the same company that has the commercial PAM50 test kit that we learned in class. And in the same paper, they cite the false discovery rate (FDR) correction paper from Benjamini and Hochberg!

This whole assignment has been a rollercoaster ride, and there were so many rabbit holes that this topic would lead you into. For example Danaher didn’t make it easy for me when they tweaked the TIL acronym, usually meaning tumour infiltrating lymphocytes, to sneak in leukocytes in the palce of lymphocytes, so that the immune cells of myeloid origin could be included in the paper. The inclusion of Claudin-low in the dataset also sent me down a rabbit hole on what is this new “subtype”. More good chills-inducingly, the lead author for one of the METABRIC-CIBERSORT papers (and the CD824), Dr. Raza Ali, is from the same group that gave us the METABRIC paper, and he is actually here in Cambridge right now! Would send him an email once I recover from this assignment.